juan camilo orduz
Introduction to Bayesian Modeling with PyMC3 - Dr. Juan Camilo Orduz
We can also see this visually. We can verify the convergence of the chains formally using the Gelman Rubin test. Values close to 1.0 mean convergence. We can also test for correlation between samples in the chains. We are aiming for zero auto-correlation to get "random" samples from the posterior distribution.
Exploring the Curse of Dimensionality - Part II. - Dr. Juan Camilo Orduz
I continue exploring the curse of dimensionality. Following the analysis form Part I., I want to discuss another consequence of sparse sampling in high dimensions: sample points are close to an edge of the sample. This post is based on The Elements of Statistical Learning, Section 2.5, which I encourage to read! Consider \(N\) data points uniformly distributed in a \(p\)-dimensional unit ball centered at the origin. Suppose we consider a nearest-neighbor estimate at the origin.
Exploring the Curse of Dimensionality - Part I. - Dr. Juan Camilo Orduz
In this post I want to present the notion of curse of dimensionality following a suggested excercise (Chapter 4 - Ex. 4) of the book An Introduction to Statistical Learning, writen by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. When the number of features \(p\) is large, there tends to be a deterioration in the performance of KNN and other local approaches that perform prediction using only observations that are near the test observation for which a prediction must be made. This phenomenon is known as the curse of dimensionality, and it ties into the fact that non-parametric approaches often perform poorly when \(p\) is large. We will now investigate this curse. Let us prepare the notebook.